SPARSE REPRESENTATIONS FOR THREE-DIMENSIONAL RANGE DATA RESTORATION
Minneapolis, MN 55455


ABSTRACT 

Sparse representations of signals, in particular with
learned dictionaries, are widely used for state-of-the-art
audio, image, and video restoration. In this paper, the problem
of denoising and occlusion restoration of 3D range data based
on dictionary learning and sparse representations is explored.
We consider the 3D surface obtained from a desktop range
scanner as an image, where the value of each pixel represents
the depth of a point on the 3D surface. Having this
image, we apply techniques from dictionary learning and
sparse representation to enhance the acquired 3D surface.
These techniques use the sparse decomposition of the
overlapping patches in the image, over an adapted over-complete
dictionary, for enhancing the data. We present experimental
results of denoising 3D surfaces following this approach. We
also propose an algorithm for filling the regions of missing
information in 3D scans and demonstrate its effectiveness.
Our experimental results are on range data obtained from a
low-cost structured-light range scanner.

Index Terms: Sparse representation, 3D surface denoising,
occlusion restoration.

1.  INTRODUCTION 

Three-dimensional (3D) data is becoming ubiquitous. However,
models obtained from 3D scanners have imperfections.
For example, the raw data obtained from a low-cost 3D range
scanner is usually noisy and may have some occlusions or
missing parts. Thus, there is an increasing need for methods
for denoising and occlusion restoration of 3D surfaces
in general and range data in particular. Recently, techniques
based on dictionary learning for sparse representation have
been widely used for image and video restoration [1, 2, 3].
In these methods, a dictionary is learned on the (overlapping)
patches of the image so that it sparsely represents those
patches; that is, each patch of the image can be well
approximated with only a few atoms from the learned dictionary.
In the works mentioned above, it has been shown that sparsely
representing overlapping patches in the image with such learned
dictionaries, and then combining them to reconstruct the image,
results in an effective image denoising method.

This work is partially supported by ARO, NGA, ONR, DARPA, and
NSF. We thank I. Ramirez and J. Mairal for providing the code for
Lasso/LARS and OMP algorithms, F. Lecumberry for his help in data
collection, and E. Gordon for building the structured-light scanner
and installing it in our lab.

In this work, we apply the framework of learned sparse
representations in order to restore 3D surfaces, range data in
particular. We also propose a new framework for filling regions
of missing information in 3D surfaces, based on ideas similar
to those presented in [2]. First, given a 3D surface scanned
by a 3D range scanner, we convert it to an image whose pixel
values represent the depth of the point corresponding to each
pixel. Then, this image is denoised by combining the sparse
representations of its fully overlapping patches over a
dictionary learned on the patches of the noisy data. In order
to obtain the denoised 3D surface, we regenerate the 3D surface
from the image by placing a point (x, y, z) on the 3D surface
for each pixel (x, y) with intensity z in the image. We also
introduce an iterative method to fill the holes of missing
information in the range data by applying the same sparse
representation method with reduced influence of the actual
holes while estimating the representation. In the experiments,
we show the good results obtained with this method for a
low-cost scanner.

The remainder of this paper is organized as follows. In
Section 2, the core algorithm for denoising 3D surfaces is
presented. The method for filling the occlusions and missing
information is introduced in Section 3, followed by the
experimental results in Section 4. Finally, we conclude the
paper in Section 5.


2.  SPARSE  REPRESENTATION  METHODS  IN  3D 
RANGE  DATA  RESTORATION 

In this section, we explain the basic algorithm we use for
denoising 3D surfaces. In Section 2.1, the data collection
process and the preprocessing of the data are explained. Some
denoising algorithms for image processing based on learned
sparse representation are reviewed in Section 2.2, along with
the details on the specific sparse representation algorithms we
use for 3D surface restoration.


2.1.  Data  Collection  and  Preprocessing 

In this work, we apply our restoration method to the
range data collected from a low-cost structured-light 3D
scanner [4]. This family of low-cost devices produces relatively
noisy range data, as well as regions with missing information
due to occlusions or lack of light reflection. In addition, the
commonly used horizontal stripe patterns in the projected light
add noise to the data in the shape of horizontal lines (Fig.
1, third column). In order to enhance the 3D data obtained
from this scanner, we first convert the points on the shape to
an image parallel to the camera matrix: each point with
coordinates (x, y, z) corresponds to a pixel (x, y) in the image.
We define the pixel value as an affine function of z, the
distance of the corresponding point from the camera. Having this
natural image representation, we apply the restoration methods
explained in the following sections for denoising or filling
the regions of missing information. Then, we show in Section 4
that converting the restored image back to 3D points yields an
enhanced 3D shape.
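
The conversion between 3D points and a depth image described above can be sketched as follows. This is a minimal illustration, assuming the points already lie on integer pixel coordinates; the function names and the `scale` and `offset` parameters of the affine map are hypothetical and not taken from the scanner software.

```python
import numpy as np

def points_to_depth_image(points, shape, scale=1.0, offset=0.0):
    """Rasterize (x, y, z) points; pixel value = scale * z + offset (affine in z)."""
    image = np.zeros(shape, dtype=float)
    mask = np.zeros(shape, dtype=bool)      # True where a point was observed
    cols = points[:, 0].astype(int)         # x indexes image columns
    rows = points[:, 1].astype(int)         # y indexes image rows
    image[rows, cols] = scale * points[:, 2] + offset
    mask[rows, cols] = True
    return image, mask

def depth_image_to_points(image, mask, scale=1.0, offset=0.0):
    """Invert the affine map: one (x, y, z) point per foreground pixel."""
    rows, cols = np.nonzero(mask)
    z = (image[rows, cols] - offset) / scale
    return np.column_stack([cols, rows, z])
```

Keeping an explicit foreground mask, rather than relying on zero-valued pixels, is a design choice of this sketch: it lets background and genuine zero depth be told apart when converting back.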

2.2.  Denoising  Surfaces  Using  Sparse  Representation 

In  this  section,  we  explain  in  detail  the  method  we  propose  for 
denoising  images  obtained  from  3D  scans.  Our  work  is  based 
on  the  algorithms  for  image  restoration  using  learned  sparse 
representations  (see  [2]  for  example). 

Assume $x_0$ is the clean image reshaped into a vector of size
$N$ and $x$ is the noisy version of $x_0$. Given $x$, we want to
find the dictionary $D$ that “best” represents the patches in $x$.
In order to find $D$, the following optimization problem is
addressed:

$$\min_{D,\alpha} \sum_{i,j} \| D\alpha_{ij} - R_{ij}x \|_2^2 \quad \text{subject to} \quad \|d_l\|_2 = 1 \ (l = 1, \dots, k) \ \text{and} \ \|\alpha_{ij}\|_p \le L, \qquad (1)$$

where $L$ is a given constant; $p \in \{0, 1\}$ and $\|\cdot\|_p$
stands for the $\ell_p$ norm; $D$ is the dictionary being learned,
with $k$ atoms of length $N$; $\alpha_{ij}$ is the vector of $k$
coefficients corresponding to the patch at location $[i, j]$,
indicating the weight of each atom from $D$ in the reconstruction
of the patch; and the binary matrix $R_{ij}$ extracts the patch at
location $[i, j]$ from the image. The minimization is performed
over the dictionary $D$ and the coding coefficients $\alpha$.
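
The role of the patch-extraction operator in Equation (1) can be illustrated with a short sketch that gathers all fully overlapping patches of an image as the columns of a matrix. The function name and the row-major patch ordering are assumptions made for illustration only.

```python
import numpy as np

def extract_patches(image, patch_size):
    """Return all fully overlapping square patches as columns of a matrix."""
    rows, cols = image.shape
    p = patch_size
    patches = [
        image[i:i + p, j:j + p].reshape(-1)   # patch at location [i, j], flattened
        for i in range(rows - p + 1)
        for j in range(cols - p + 1)
    ]
    return np.stack(patches, axis=1)          # shape: (p * p, number of patches)
```

For an image of size r × c and patches of size p × p, this produces (r − p + 1)(c − p + 1) columns, one per patch location.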

Algorithm 1 summarizes the general approach used to
solve this non-convex problem (for p = 1 the problem is convex
in each variable separately, but not jointly in both).

Algorithm 1 Image restoration based on sparse representation

Initialization: Let $D$ be some initial dictionary.

Dictionary Learning: Repeat $J$ times or until convergence

• Sparse Coding: With $D$ fixed, solve the optimization problem
(Equation (1)) to find the coefficients $\alpha_{ij}$. This problem is
convex for $p = 1$ and can be addressed using LARS, LASSO,
soft-thresholding, etc. For $p = 0$, Orthogonal Matching Pursuit is
commonly used.

• Dictionary Update: In this step, we update the dictionary based on
the error between the reconstructed patches and the originals [1, 5].

Image Restoration: In this part we average the reconstructed overlapping
patches to restore the image. Such reconstructed patches are obtained by
sparse coding with the learned dictionary.

In this work, we use the unconstrained $\ell_1$ penalty,

$$\min_{\alpha_{ij}} \| D\alpha_{ij} - R_{ij}x \|_2^2 + \lambda \|\alpha_{ij}\|_1, \qquad (2)$$

for each pair $[i, j]$. In order to solve this optimization
problem we used the LARS-Lasso algorithm [6], which is one of
the most efficient algorithms in the literature for $\ell_1$ penalty
problems. We update the dictionary using a variation of the
“Method of Optimal Direction” (MOD) [5], which updates
the dictionary based on the current coefficients to minimize
the error in Equation (1). In particular, let $X$ be a matrix
whose columns are the patches of the image and $A$ be a matrix
whose columns are the $\alpha_{ij}$'s. The dictionary that minimizes
Equation (1), with $A$ fixed (and ignoring the atom normalization
constraint), is $D = XA^T(AA^T)^{-1}$. See [7] for more
details on the core components of the optimization used.
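
The closed-form MOD update above can be sketched in a few lines. This is an illustrative sketch rather than the authors' implementation: the small ridge term `eps` and the final renormalization of the atoms are assumptions added here for numerical stability and to restore unit-norm atoms.

```python
import numpy as np

def mod_update(X, A, eps=1e-8):
    """MOD dictionary update D = X A^T (A A^T)^{-1}, followed by atom renormalization.

    X: (patch_dim, num_patches) matrix of patches.
    A: (k, num_patches) matrix of sparse coefficients.
    """
    k = A.shape[0]
    gram = A @ A.T + eps * np.eye(k)          # ridge (assumption) guards unused atoms
    # gram is symmetric, so solving gram @ D.T = A @ X.T gives D = X A^T gram^{-1}.
    D = np.linalg.solve(gram, A @ X.T).T
    norms = np.linalg.norm(D, axis=0)
    norms[norms < eps] = 1.0                  # leave all-zero atoms untouched
    return D / norms
```

Solving the linear system instead of forming the inverse explicitly is the usual numerically safer choice; after renormalization the coefficients would normally be re-estimated in the next sparse coding step.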

In the last sparse coding step for the actual image
restoration, after the dictionary has been learned, the best results
were obtained when imposing $\|\alpha_{ij}\|_0 \le L$. We then
applied the orthogonal variation of matching pursuit (OMP) [8]
with $L = 2$. This combination of $\ell_1$ (via LARS) with MOD
at the learning stage and $\ell_0$ with OMP at the restoration step
has been experimentally found to perform best for this and other
image datasets we have tested.¹
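
For readers unfamiliar with OMP, the restoration-stage sparse coding with at most L atoms can be sketched as below. This is a textbook greedy variant written for illustration, assuming unit-norm dictionary columns; it is not the code used in the experiments.

```python
import numpy as np

def omp(D, y, L):
    """Greedy OMP: select up to L atoms of D (unit-norm columns) to approximate y."""
    residual = y.astype(float).copy()
    support = []
    coeffs = np.zeros(0)
    for _ in range(L):
        correlations = np.abs(D.T @ residual)
        if support:
            correlations[support] = 0.0       # never reselect an atom
        best = int(np.argmax(correlations))
        if correlations[best] < 1e-12:        # residual already explained
            break
        support.append(best)
        # Orthogonal step: least-squares refit on all selected atoms.
        coeffs, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coeffs
    alpha = np.zeros(D.shape[1])
    if support:
        alpha[support] = coeffs
    return alpha
```

With L = 2, as in the text, each patch is approximated by a combination of at most two atoms.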

Now that we have an algorithm for image denoising based
on sparse representation, we can use it to denoise 3D surfaces.
In order to find the 3D surface we can simply assign a point
(x, y, z) to each foreground pixel (x, y) in the image whose
intensity is z. The collection of these points makes up the
restored 3D shape.
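
The “Image Restoration” step of Algorithm 1, averaging the reconstructed overlapping patches, can be sketched as follows. The function name is hypothetical, and the patches are assumed to be stored as columns in row-major order of their top-left corner.

```python
import numpy as np

def average_patches(patches, image_shape, patch_size):
    """Place each reconstructed patch back and average where patches overlap."""
    rows, cols = image_shape
    p = patch_size
    total = np.zeros(image_shape)
    count = np.zeros(image_shape)
    index = 0
    for i in range(rows - p + 1):
        for j in range(cols - p + 1):
            total[i:i + p, j:j + p] += patches[:, index].reshape(p, p)
            count[i:i + p, j:j + p] += 1      # how many patches cover each pixel
            index += 1
    return total / count
```

Pixels near the image border are covered by fewer patches than interior pixels, which is why the division uses a per-pixel count rather than a constant.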

3.  FILLING  MISSING  INFORMATION 

Similar to images, occlusions or missing information can occur
when scanning 3D data. We now investigate methods for
filling/inpainting the holes in a 3D shape, assuming the location of
the holes is known.² In [2], the problem of image inpainting
is investigated using sparse representations. Based on this
work, we address this problem for 3D range data.

¹ We thank Julien Mairal for proposing this combination and for the
very exhaustive testing supporting it.

² This can often be easily detected as lack of signal.

The main idea for filling holes is to disregard or reduce
the effect of the hole pixels in the error component of
Equation (2) when updating the dictionary and coefficients in
the algorithm. In the first two steps of Algorithm 1, which form
the “Dictionary Learning” stage, we remove all the patches
that have missing information, to avoid learning these
irregular structures in the dictionary. In the last step, “Image
Restoration,” in the sparse coding part to find the optimum
coefficients we define a new objective function:

$$\min_{\alpha_{ij}} \| (R_{ij}W) \odot (D\alpha_{ij} - R_{ij}x) \|_2^2 \quad \text{subject to} \quad \|\alpha_{ij}\|_0 \le L, \qquad (3)$$

where $W$ is an adaptive matrix of weights corresponding to
each pixel (see below) and $\odot$ denotes element-wise
multiplication. In this case, we first subtract the DC
value of each patch before estimating the coefficients $\alpha_{ij}$
and add it back to the estimated patch in the reconstruction step.
For the patches containing holes, we set the average value of
the non-hole pixels as the DC of the patch. In order to denoise
the image and fill the missing information (holes), we apply
Algorithm 2 to the image obtained from the damaged data.
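
The weighted objective in Equation (3), together with the DC handling described above, can be sketched for a single patch as follows. This is a hedged illustration: the function name, the separate `known` mask of non-hole pixels, and the normalization of the weighted atoms during atom selection are assumptions of this sketch and are not specified in the text.

```python
import numpy as np

def weighted_sparse_code(D, y, w, known, L):
    """Approximate patch y with at most L atoms, weighting pixel errors by w.

    D: (patch_dim, k) dictionary; y: flattened patch; w: per-pixel weights;
    known: boolean mask of non-hole pixels.
    """
    dc = y[known].mean()                      # DC from non-hole pixels only
    target = w * (y - dc)                     # weighted, DC-removed patch
    Dw = D * w[:, None]                       # weighted atoms: rows scaled by w
    norms = np.linalg.norm(Dw, axis=0)
    norms[norms < 1e-12] = 1.0
    residual, support, coeffs = target.copy(), [], np.zeros(0)
    for _ in range(L):
        corr = np.abs(Dw.T @ residual) / norms
        if support:
            corr[support] = 0.0               # never reselect an atom
        best = int(np.argmax(corr))
        if corr[best] < 1e-12:
            break
        support.append(best)
        coeffs, *_ = np.linalg.lstsq(Dw[:, support], target, rcond=None)
        residual = target - Dw[:, support] @ coeffs
    alpha = np.zeros(D.shape[1])
    if support:
        alpha[support] = coeffs
    return D @ alpha + dc, alpha              # full patch estimate, holes included
```

Because pixels with zero weight contribute nothing to the fit, the value stored in a hole does not influence the coefficients; the hole is filled by the unweighted reconstruction `D @ alpha + dc`.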

Algorithm 2 Iterative algorithm for filling holes on 3D surfaces

Initialization: Let W be the matrix of weights, which has value zero for
the hole pixels and one for the rest.

Image Restoration: Find the coefficients that minimize Equation (3) for
a given x, and reconstruct the image based on these coefficients.

Hole Restoration: Find the coefficients that minimize Equation (3) for
x being the restored image from the previous step, and reconstruct only the
holes based on these coefficients, avoiding over-smoothing in the rest of
the image.

Update  Weights:  Increase  the  weights  of  all  the  hole  pixels  by  ruj,  (in  our 
case  Wfi  =  |). 

Image Restoration: Find the coefficients that minimize Equation (3) for
x being the restored image from the previous restoration step, and reconstruct
the image based on these coefficients.
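
The control flow of Algorithm 2 can be sketched as below. This only illustrates the order of the steps: `restore(image, W)` is a hypothetical user-supplied function standing in for the weighted sparse-coding restoration of Equation (3), and `weight_step` is a placeholder for the weight increment, whose value is not taken from the text.

```python
import numpy as np

def fill_holes(x, hole_mask, restore, weight_step):
    """Order of the steps in Algorithm 2; `restore(image, W)` is supplied by the caller."""
    # Initialization: weight zero on hole pixels, one elsewhere.
    W = np.where(hole_mask, 0.0, 1.0)
    # Image Restoration on the damaged input.
    restored = restore(x, W)
    # Hole Restoration: restore again, but keep the new values only inside holes.
    second_pass = restore(restored, W)
    restored = np.where(hole_mask, second_pass, restored)
    # Update Weights: hole pixels now carry estimates, so trust them a little more.
    W = np.where(hole_mask, W + weight_step, W)
    # Final Image Restoration with the updated weights.
    return restore(restored, W)
```

Restricting the second pass to the hole pixels mirrors the stated goal of avoiding over-smoothing in the rest of the image.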


4.  EXPERIMENTAL  RESULTS 

In our experiments, we apply the proposed range-data
restoration framework to data obtained by scanning some toys
with a structured-light 3D scanner. This scanner finds the depth
of each point based on the image of the object after horizontal
stripes are projected onto it. Because of these stripes,
additional noise in the shape of horizontal lines is added to
the collected data (Fig. 1, third column).
In some parts of the shape these lines are deeper and more
difficult to remove. Also, since they exist in all the shapes,
this repetitive noise might be learned in the “Dictionary Learning”
process. In these experiments, after collecting the data and
converting the shape to an image, we normalized the intensity
values and set the background to zero. In order to avoid
distortions around the boundary of the shape, we reflected the
values of the pixels close to the boundary inside the shape to
the pixels outside the shape. We added some random holes, as
patches of size 10 x 10 pixels, to each image to represent the
occlusions. In both the “Dictionary Learning” and the “Image
Restoration” steps we used all the patches of size 15 x 15 in
the image. In Equation (2), $\lambda$ was experimentally set to 0.16.
The number of atoms in the dictionary was $k = 500$, which
makes the dictionary over-complete, since each 15 x 15 patch has
only 225 entries. We set $J = 10$ in the “Dictionary Learning”
stage, and obtained the best results with $L = 2$ in
Equation (3).
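
The synthetic occlusions described above can be generated with a sketch like the following. The number of holes, the random seed, and the choice to set hole pixels to zero are assumptions made for illustration; the text only states that random 10 x 10 holes were added.

```python
import numpy as np

def add_random_holes(image, num_holes, hole_size=10, seed=0):
    """Zero out random square regions; return the damaged image and the weight matrix W."""
    rng = np.random.default_rng(seed)
    damaged = image.copy()
    W = np.ones(image.shape)                  # one for valid pixels, zero for holes
    rows, cols = image.shape
    for _ in range(num_holes):
        i = rng.integers(0, rows - hole_size + 1)
        j = rng.integers(0, cols - hole_size + 1)
        damaged[i:i + hole_size, j:j + hole_size] = 0.0
        W[i:i + hole_size, j:j + hole_size] = 0.0
    return damaged, W
```

The returned `W` matches the initialization of Algorithm 2, with zero weight on hole pixels and one elsewhere.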


In Fig. 1, some examples of the scanned objects are presented
as images and 3D shapes. After applying Algorithm 2 to these
shapes, the holes were filled and all the noise was removed
except for some lines (caused by the scanning method) still
left on the shapes. In order to reduce these residual
imperfections, we applied Algorithm 2 again on the original shape but
with a dictionary learned from the restored image. For most of
the shapes, reapplying Algorithm 2 improved the results. To
further improve the results, we learned a dictionary on the
restored images obtained from 12 shapes and used this global
dictionary to denoise the original shapes with Algorithm 2.
The effect of applying the global dictionary differed across
shapes: a line was added back for the two dogs, improved
results were obtained for the pig, and no significant
change was observed for the other two shapes. Finally, Figure
2 shows some of the learned dictionaries for the dog shape, the pig
shape, and all 12 shapes (the global one). Note that the
horizontal lines had the most influence on the dictionary learned
on the pig and the least influence on the dictionary learned on
the dog, which explains the behavior of the denoising of these
shapes before and after applying the global dictionary.

5.  CONCLUSIONS 

In this paper, we introduced a new framework for the
restoration of 3D range data. We applied sparse representation
methods on images obtained from 3D surfaces in order to both
denoise and fill the occluded parts of the shapes. In our
experiments we tested these methods on data obtained from
a low-cost structured-light range scanner. Our experimental
results demonstrate the effectiveness of these methods in
denoising and filling the missing information of the 3D surfaces.
We are currently working on the challenges of extending this
work to full 3D shapes.
Fig. 1. Results of the proposed method in Algorithm 2 on five shapes in the dataset. From left to right: the first column shows a picture of the
objects, the second column shows the 3D shape obtained from the 3D scanner, and the third column is the range-data image converted from the shapes
in column two (note the horizontal lines). The fourth column shows the shifted intensity value (depth) of the pixels on the three lines shown in
the images in the third column. The fifth column shows the restored shapes after the second run of Algorithm 2 with the dictionary learned
on the restored image from the first run. The sixth column is the result of the third run of Algorithm 2 with the dictionary learned on the restored
images of 12 shapes. (This is a color figure.)


Fig.  2.  Learned  range  data  dictionaries  for  12  shapes,  dog  shape,  and  pig  shape,  respectively. 


